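Continual deep learning is an emerging field in which considerable progress has been made. However, most approaches have so far been tested only on image classification, a task of little relevance to intelligent vehicles. Approaches for class-incremental semantic segmentation were proposed only recently, and all of them rely on some form of knowledge distillation. Replay-based approaches, which are commonly used for object recognition in continual settings, have not yet been investigated for this task. At the same time, although unsupervised domain adaptation for semantic segmentation has gained considerable traction, domain-incremental learning in a continual setting remains understudied. The goal of our work is therefore to evaluate and adapt established solutions for continual object recognition to semantic segmentation, and to provide baseline methods and evaluation protocols for continual semantic segmentation. We first introduce evaluation protocols for class- and domain-incremental segmentation and analyze selected approaches. We show that the nature of the semantic segmentation task changes which methods are most effective at mitigating forgetting, compared with image classification. In particular, knowledge distillation proves to be a vital tool in class-incremental learning, whereas replay methods are the most effective in domain-incremental learning.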
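As a rough illustration of the knowledge-distillation idea mentioned in this abstract, the sketch below computes a per-pixel distillation loss between a frozen old model and the model being trained. The function names, the temperature value, and the choice to compare only the old classes are assumptions made for illustration; the abstract does not specify the loss used.

```python
import math

EPS = 1e-12  # guards against log(0); an illustrative choice


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one pixel's class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Mean per-pixel KL(old || new), restricted to the old model's classes.

    old_logits: list of per-pixel logit lists from the frozen old model.
    new_logits: list of per-pixel logit lists from the model being trained;
                it may contain extra entries for newly added classes.
    """
    total = 0.0
    for old_px, new_px in zip(old_logits, new_logits):
        p_old = softmax(old_px, temperature)
        # Assumption: only the outputs for previously learned classes are compared.
        p_new = softmax(new_px[:len(old_px)], temperature)
        total += sum(
            p * (math.log(p + EPS) - math.log(q + EPS))
            for p, q in zip(p_old, p_new)
        )
    return total / len(old_logits)
```

The loss is zero when the new model reproduces the old model's predictions on the old classes and grows as the two drift apart, which is what discourages forgetting.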
Compared to regular cameras, Dynamic Vision Sensors, or event cameras, asynchronously output compact visual data based on intensity changes at each pixel location. In this paper, we study the application of current image-based SLAM techniques to these novel sensors. To this end, the information in adaptively selected event windows is processed to form motion-compensated images. These images are then used to reconstruct the scene and estimate the 6-DOF pose of the camera. We also propose an inertial version of the event-only pipeline to assess its capabilities. We compare the results of different configurations of the proposed algorithm against the ground truth for sequences from two publicly available event datasets. We also compare the results of the proposed event-inertial pipeline with the state of the art and show that it can produce comparable or more accurate results, provided the map estimate is reliable.
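To make the notion of a motion-compensated image concrete, the following minimal sketch warps every event in a window back to a reference time before accumulating it into an image. It assumes a single constant image-plane velocity for the whole window, which is a strong simplification; the function name and the `(x, y, t, polarity)` event layout are hypothetical, and the abstract does not describe the actual motion model.

```python
def motion_compensated_image(events, flow, t_ref, width, height):
    """Accumulate events into an image after warping them to time t_ref.

    events: iterable of (x, y, t, polarity) with polarity in {+1, -1}.
    flow:   assumed constant image-plane velocity (vx, vy) in pixels/second.
    """
    vx, vy = flow
    image = [[0 for _ in range(width)] for _ in range(height)]
    for x, y, t, polarity in events:
        dt = t - t_ref
        # Move each event back along the assumed motion to where it
        # would have fired at t_ref.
        xw = int(round(x - vx * dt))
        yw = int(round(y - vy * dt))
        if 0 <= xw < width and 0 <= yw < height:
            image[yw][xw] += polarity
    return image
```

With the correct velocity, events triggered by the same scene point pile up on one pixel and the image is sharp; with a wrong velocity they smear along the motion direction.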
With the advent of deep learning applications on edge devices, researchers are actively trying to optimize deployment on low-power, memory-constrained hardware. Established compression methods such as quantization, pruning, and architecture search leverage commodity hardware. Apart from these conventional compression algorithms, one may redesign the operations of deep learning models so that they lead to a more efficient implementation. To this end, we propose EuclidNet, a compression method designed for hardware implementation, which replaces multiplication, $xw$, with Euclidean distance, $(x-w)^2$. We show that EuclidNet is aligned with matrix multiplication and that it can be used as a measure of similarity in convolutional layers. Furthermore, we show that under various transformations and noise scenarios, EuclidNet matches the performance of deep learning models designed with multiplication operations.
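The relationship between the two operations can be checked numerically. The sketch below compares a dot product with a negative squared Euclidean distance and verifies the identity $-\|x-w\|^2 = 2\,x\cdot w - \|x\|^2 - \|w\|^2$. Using the negative distance as the similarity score is an assumption made here for illustration, and the function names are hypothetical.

```python
def dot_similarity(x, w):
    """Standard multiply-accumulate similarity: sum of x_i * w_i."""
    return sum(xi * wi for xi, wi in zip(x, w))


def euclid_similarity(x, w):
    """Negative squared Euclidean distance: -sum of (x_i - w_i)^2.

    Assumption: the sign is flipped so that larger means more similar.
    """
    return -sum((xi - wi) ** 2 for xi, wi in zip(x, w))


def squared_norm(v):
    """Squared L2 norm of a vector."""
    return sum(vi * vi for vi in v)


x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 2.0]
lhs = euclid_similarity(x, w)
rhs = 2 * dot_similarity(x, w) - squared_norm(x) - squared_norm(w)
```

Here `lhs` and `rhs` agree, which shows that the distance-based score differs from the dot product only by a factor of two and the norm terms of the input and the weights.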
Recurrent neural networks (RNNs) are the backbone of many text and speech applications. These architectures are typically made up of several computationally complex components, such as non-linear activation functions, normalization, bi-directional dependence, and attention. To maintain good accuracy, these components are frequently run using full-precision floating-point computation, making them slow, inefficient, and difficult to deploy on edge devices. In addition, the complex nature of these operations makes them challenging to quantize using standard quantization methods without a significant performance drop. We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN). Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activation functions, to serve a wide range of state-of-the-art RNNs. The proposed method enables RNN-based language models to run on edge devices with a $2\times$ improvement in runtime and a $4\times$ reduction in model size, while maintaining accuracy similar to that of their full-precision counterparts.
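A piecewise linear approximation of an activation function can be sketched as follows. This example uses uniformly spaced knots and floating-point arithmetic for readability, whereas the abstract describes an adaptive, integer-only variant; the knot range, the number of pieces, and the function names are all assumptions.

```python
import bisect
import math


def build_pwl(fn, lo, hi, num_pieces):
    """Sample fn at uniformly spaced knots on [lo, hi]."""
    step = (hi - lo) / num_pieces
    knots = [lo + i * step for i in range(num_pieces + 1)]
    values = [fn(k) for k in knots]
    return knots, values


def eval_pwl(x, knots, values):
    """Linear interpolation between knots; clamps outside the knot range."""
    if x <= knots[0]:
        return values[0]
    if x >= knots[-1]:
        return values[-1]
    i = bisect.bisect_right(knots, x) - 1
    t = (x - knots[i]) / (knots[i + 1] - knots[i])
    return values[i] + t * (values[i + 1] - values[i])


# Approximate tanh with 16 linear pieces on [-4, 4].
knots, values = build_pwl(math.tanh, -4.0, 4.0, 16)
approx = eval_pwl(0.3, knots, values)
```

Each evaluation needs only a table lookup, one multiplication, and one addition, which is what makes this form attractive for integer hardware. Placing knots adaptively, where the function curves most, would reduce the error for the same number of pieces.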
GTFLAT, a game-theory-based add-on, addresses an important research question: how can a federated learning algorithm achieve better performance and training efficiency by setting more effective adaptive weights for averaging in the model aggregation phase? An ideal answer to this question should (1) enable federated learning algorithms to reach better performance in fewer communication rounds, notably in heterogeneous scenarios, and (2) be easy to use alongside state-of-the-art federated learning algorithms as a new module. To this end, GTFLAT models the averaging task as a strategic game among active users. It then proposes a systematic solution based on population games and evolutionary dynamics to find the equilibrium. In contrast with existing approaches that impose the weights on the participants, GTFLAT reaches a self-enforcing agreement among clients, such that none of them is motivated to deviate from it individually. The results reveal that, on average, using GTFLAT increases the top-1 test accuracy by 1.38% while needing 21.06% fewer communication rounds to reach that accuracy.
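The aggregation step that such adaptive weights feed into can be sketched as a weighted average of client parameters. The sketch below only shows how a given weight vector is applied; it does not reproduce how GTFLAT derives the weights from the game equilibrium, and the function name and flat parameter layout are assumptions.

```python
def weighted_average(client_params, weights):
    """Aggregate client parameter vectors using normalized weights.

    client_params: list of equally sized parameter lists, one per client.
    weights:       one non-negative weight per client (need not sum to 1).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalized = [w / total for w in weights]
    dim = len(client_params[0])
    return [
        sum(nw * params[j] for nw, params in zip(normalized, client_params))
        for j in range(dim)
    ]


clients = [[1.0, 2.0], [3.0, 4.0]]
uniform = weighted_average(clients, [1.0, 1.0])
skewed = weighted_average(clients, [3.0, 1.0])
```

Uniform weights recover plain averaging, while non-uniform weights pull the aggregated model toward the clients that the weighting scheme favors.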
DeepAngle is a machine-learning-based method to determine the contact angles of different phases in tomography images of porous materials. Measurement of angles in 3-D needs to be done within the surface perpendicular to the angle planes, and it can become inaccurate when dealing with the discretized space of image voxels. A computationally intensive solution is to correlate and vectorize all surfaces using an adaptable grid, and then measure the angles within the desired planes. In contrast, the present study provides a rapid and low-cost technique powered by deep learning to estimate the interfacial angles directly from images. DeepAngle is tested on both synthetic and realistic images against the direct measurement technique and is found to improve the r-squared by 5 to 16% while lowering the computational cost 20 times. This rapid method is especially applicable to processing large tomography datasets and time-resolved images, which is otherwise computationally intensive. The developed code and the dataset are available at an open repository on GitHub (https://www.github.com/ArashRabbani/DeepAngle).
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
The availability of Martian atmospheric data provided by several Martian missions has broadened the opportunity to investigate and study the conditions of the Martian ionosphere. As such, ionospheric models play a crucial part in improving our understanding of ionospheric behavior in response to different spatial, temporal, and space weather conditions. This work represents an initial attempt to construct an electron density prediction model of the Martian ionosphere using machine learning. The model targets the ionosphere at solar zenith angles ranging from 70 to 90 degrees, and as such only utilizes observations from the Mars Global Surveyor mission. The performance of different machine learning methods was compared in terms of root mean square error, coefficient of determination, and mean absolute error. The bagged regression trees method performed best out of all the evaluated methods. Furthermore, the optimized bagged regression trees model outperformed other Martian ionosphere models from the literature (MIRI and NeMars) in finding the peak electron density value and the peak density height, in terms of root mean square error and mean absolute error.
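The three comparison metrics named in this abstract have standard definitions, sketched below in plain Python. The sample values are made up purely to exercise the functions and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
scores = (rmse(observed, predicted), mae(observed, predicted), r_squared(observed, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a sense of whether a model's error is dominated by a few outliers.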
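This study develops a framework for unmanned aerial systems (UASs) to monitor fall hazard prevention systems near unprotected edges and openings in high-rise building projects. A three-step, machine-learning-based framework was developed and tested to detect guardrail posts in images captured by a UAS. First, a guardrail detector was trained to localize candidate positions of the posts supporting the guardrail. Because the images used in this process were collected from an actual job site, several false detections were identified. Additional constraints were therefore introduced in the following steps to filter out false detections. Second, the research team applied a horizontal line detector to the images to correctly detect floors and remove detections that were not close to a floor. Finally, because guardrail posts are installed at roughly regular intervals, the spacing between them was estimated and used to find the most likely distance between two posts. The research team used various combinations of the developed methods to monitor guardrail systems in images captured at a high-rise building project. A comparison of precision and recall metrics indicated that the cascade classifier achieves better performance when combined with floor detection and guardrail spacing estimation. The results show that the proposed guardrail recognition system can improve the assessment of guardrails and ease the safety engineer's task of identifying fall hazards in high-rise building projects.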
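We present a method for one-shot image synthesis that enables controlled manipulation of a single image by inverting a quasi-robust classifier equipped with strong regularizers. Our proposed method, named Magic, samples structured gradients from a pre-trained quasi-robust classifier to better preserve the input semantics while retaining its classification accuracy, thereby ensuring credibility in the synthesis. Unlike current methods that use complex primitives to supervise the process or use attention maps as a weak supervisory signal, Magic aggregates gradients over the input, driven by a binary guide mask that enforces a strong spatial prior. Within a single framework, Magic implements a range of manipulations, including shape and location control, strong non-rigid shape deformations, and copy/move operations in the presence of repeated objects, and it gives users firm control over the synthesis by requiring them only to specify binary guide masks. Our study and findings are supported by various qualitative comparisons with the state of the art on the same images sampled from ImageNet, by quantitative analysis using machine perception, and by a user survey of more than 100 participants that endorses our synthesis quality.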